Let’s Go To The Movies

Davin Dillon

2022-04-18

Paying for Oscars or Oscars for Pay?

Data Wrangling

Preparing the data took many renames and, in some cases, a decision about which movie budget to use. Once I had the columns I wanted, I set about joining the data together. First, I joined the metadata with the budget data to include all of the titles for which I had information. My next adventure (or misadventure) was to join this data with the inflation data. Once that was done, all that was left was joining this information with the Oscar data. All in all, I used three full joins.

oscars <- oscars %>%    # rename year for join
  rename('year' = 'year_film')

meta$year <-  format(as.Date(meta$release_date, format = "%m/%d/%Y"), "%Y")

blue <- '#000080'  # navy blue hex code

meta <- meta %>% 
  mutate(year = as.numeric(year))

# renames
budget <- budget %>% 
  rename(vote_average = score) %>% # renaming vote data for joins etc
  rename(vote_count = votes) %>% 
  rename(Title = 'Movie Title')


oscars <- oscars %>% 
  rename(Title = 'film')  # rename for joins

adj <- inflation %>% 
  mutate(multiplier = (22.82/amount)) 
# create multiplier column for easy calculations

budget <- budget %>% 
  rename(new_budget = Budget)  # rename for joins etc


options(scipen = 100) # avoid scientific notation 

# full join of metadata and budget data on their shared columns,
# to get new and old movies etc
full_budget <- full_join(meta, budget,
                         by = c("runtime", "Title", "vote_average",
                                "vote_count", "year"))

# set budget to max of two different budgets.
# picking max is arbitrary, but needed in most cases
full_bud <- full_budget %>% 
  mutate(budget = pmax(new_budget, meta_budget, na.rm = TRUE)) %>% 
  select(Title, genres, budget, new_budget, meta_budget, popularity, year,
         release_date, revenue, runtime, vote_average,
         vote_count, gross)


# remove first three unnecessary rows, then sort by budget
full_bud <- full_bud[-c(1, 2, 3), ] %>%
  arrange(desc(as.numeric(budget)))


# format(date, format="%Y")


full_bud <- full_bud %>% 
  mutate(year = replace(year, year == 1900, 2022))


# full join with the inflation data on year, then adjust budget and gross
adj_bud <- full_join(full_bud, adj, by = "year") %>% 
  mutate(with_inflation = as.numeric(budget) * multiplier) %>% 
  mutate(gross_inflation = as.numeric(gross) * multiplier) %>% 
  select(Title, genres, vote_average, vote_count, budget,
         with_inflation, gross_inflation, year, gross,
         release_date) %>% 
  arrange(desc(with_inflation))

# full join with the Oscar data on title and year
osc_bud <- full_join(adj_bud, oscars, by = c("Title", "year"))
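
As a small illustration of the inflation step, here is a minimal sketch on made-up data. The tables `movies` and `inflation_toy`, their values, and the name `base_amount` are all hypothetical; the sketch also assumes that 22.82 is the index amount for the base year (2022), which the real inflation data may or may not match.

```r
library(dplyr)

# Toy stand-ins for the real tables (all values are made up for illustration)
movies <- tibble(
  Title  = c("Film A", "Film B", "Film C"),
  year   = c(1990, 2005, 2022),
  budget = c(10e6, 50e6, 80e6)
)
inflation_toy <- tibble(
  year   = c(1990, 2005, 2022, 1950),
  amount = c(11.41, 17.05, 22.82, 2.10)   # hypothetical index values
)

# Multiplier that converts each year's dollars into base-year dollars
# (assumption: the base year is 2022 and its index amount is 22.82)
base_amount <- inflation_toy$amount[inflation_toy$year == 2022]
adj_toy <- inflation_toy %>%
  mutate(multiplier = base_amount / amount)

# An explicit `by` keeps the join key visible. A full join keeps unmatched
# rows from both sides, here the 1950 index row that has no movie.
adj_movies <- full_join(movies, adj_toy, by = "year") %>%
  mutate(with_inflation = budget * multiplier)

adj_movies
```

Because the join is a full join, years that appear only in the inflation table come through as rows with a missing `Title`, so it can be worth checking the row count after each join.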

Some Dataset Numbers

The data I was able to collect contains 4,834 movies nominated for an Academy Award since the awards' inception in 1929. Of these 4,834 movies, 1,274 won at least one award. There have been 13,312 total Oscar nominations, and 1,274 total Oscar wins in the dataset. 559 movies have been nominated for Best Picture in its many forms. Out of these, 92 won. 1,154 movies have had an actor or actress nominated in either a leading or supporting role. 313 movies had at least one winner in an acting category.
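
Counts like these could be produced with a single `summarise()` call. The sketch below runs on a made-up nominations table with one row per nomination; the column names `category` and `winner` are assumptions for illustration and may not match the real Oscar data.

```r
library(dplyr)

# Toy nominations table: one row per nomination (values are made up)
noms <- tibble(
  Title    = c("Film A", "Film A", "Film B", "Film C", "Film C"),
  category = c("BEST PICTURE", "ACTOR", "BEST PICTURE", "ACTRESS", "FILM EDITING"),
  winner   = c(TRUE, TRUE, FALSE, TRUE, FALSE)   # hypothetical logical column
)

counts <- noms %>%
  summarise(
    nominated_movies  = n_distinct(Title),          # distinct nominated titles
    winning_movies    = n_distinct(Title[winner]),  # titles with at least one win
    total_nominations = n(),                        # one row per nomination
    total_wins        = sum(winner)                 # winning rows
  )

counts
```

Counting movies and counting wins are separate calculations: in the toy data, two movies win but there are three wins in total, because one movie wins twice.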

Plotting Profit

Highest Percent Profit

Lowest Percent Profit

Who spends more money?

  • The average Oscar winner budget was $82,927,713.63.
  • The median Oscar winner budget was $52,732,524.55.
  • The average Oscar loser budget was $73,041,808.78.
  • The median Oscar loser budget was $49,789,090.91.
  • The average non-nominated budget was $50,077,035.39.
  • The median non-nominated budget was $34,659,781.29.

Who makes more money?

  • The average Oscar winner percent profit was 634.94%.
  • The median percent profit for Oscar winners was 575.55%.
  • The average Oscar loser percent profit was 444.90%.
  • The median percent profit for Oscar losers was 377.84%.
  • The average non-nominated percent profit was 248.09%.
  • The median percent profit for non-nominated movies was 161.21%.
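
Group averages and medians like the ones above can be sketched with `group_by()` and `summarise()`. Everything in this example is hypothetical: the `films` table, its `status` column, and its values are made up, and the percent profit formula, (gross - budget) / budget * 100, is an assumption, since the exact definition used for the figures above is not shown.

```r
library(dplyr)

# Toy data in arbitrary units (values are made up for illustration)
films <- tibble(
  Title  = c("A", "B", "C", "D", "E", "F"),
  status = c("winner", "winner", "loser", "loser",
             "non-nominated", "non-nominated"),
  budget = c(10, 20, 10, 40, 5, 10),
  gross  = c(30, 100, 20, 60, 5, 30)
)

profit_summary <- films %>%
  # assumed definition: profit as a percentage of budget
  mutate(pct_profit = (gross - budget) / budget * 100) %>%
  group_by(status) %>%
  summarise(
    mean_budget       = mean(budget, na.rm = TRUE),
    median_budget     = median(budget, na.rm = TRUE),
    mean_pct_profit   = mean(pct_profit, na.rm = TRUE),
    median_pct_profit = median(pct_profit, na.rm = TRUE)
  )

profit_summary
```

Reporting the median next to the mean is useful here because a few very profitable movies can pull the mean well above the typical value.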

Spent vs Made

Best Picture Budgets with Inflation

Adjusted for inflation, the Best Picture winner with the lowest budget was Marty, at $3,674,769.95. The average inflation-adjusted budget for a Best Picture winner was $51,335,637.82. The Best Picture winner with the highest inflation-adjusted budget was Titanic, at $358,241,758.24.
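
One way to pull out the cheapest and most expensive titles along with the average is sketched below. The `best_picture` table and its values are made up for illustration, and `slice_min()` and `slice_max()` assume dplyr 1.0.0 or later.

```r
library(dplyr)

# Toy table of Best Picture winners (values are made up for illustration)
best_picture <- tibble(
  Title          = c("Film A", "Film B", "Film C"),
  with_inflation = c(4e6, 50e6, 300e6)
)

cheapest   <- slice_min(best_picture, with_inflation, n = 1)  # lowest budget row
priciest   <- slice_max(best_picture, with_inflation, n = 1)  # highest budget row
avg_budget <- mean(best_picture$with_inflation, na.rm = TRUE)

cheapest$Title
priciest$Title
avg_budget
```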